Merging Duplicate Bug Reports by Sentence Clustering
Not recorded
Abstract
Duplicate bug reports are generally undesirable because identifying them as duplicates, marking them as such, and eventually discarding them consumes many person-hours. During this time no progress is made on the program in question, so the effort is an overhead that should be minimized. Considerable research has been carried out to alleviate this problem, and many methods have been proposed for bug report categorization and duplicate bug report detection. However, a duplicate bug report often provides additional information about a problem that could help resolve the bug faster. We propose that duplicate bug reports be merged when possible instead of being discarded, so that as much information as possible is captured. We propose a clustering-based algorithm that groups similar sentences together and creates a union of the bug reports considered duplicates of each other.
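As a rough illustration of the idea of merging duplicates by grouping similar sentences, the following is a minimal sketch and not the authors' algorithm, which the abstract does not specify. It assumes a naive regex sentence splitter, Jaccard similarity over lowercase word sets, greedy single-pass clustering, and a hypothetical `threshold` of 0.5; the function names are illustrative only.

```python
import re


def split_sentences(text):
    # Assumption: a naive splitter that breaks after '.', '!' or '?' plus whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokens(sentence):
    # Lowercased alphanumeric words, as a set (word order and counts are ignored).
    return set(re.findall(r"[a-z0-9]+", sentence.lower()))


def jaccard(a, b):
    # Size of the intersection divided by size of the union of two token sets.
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def merge_reports(reports, threshold=0.5):
    """Greedily cluster sentences from duplicate reports and return one
    sentence per cluster, so the result is a union of the reports' content."""
    clusters = []  # each cluster is a list of sentences; the first is its representative
    for report in reports:
        for sentence in split_sentences(report):
            toks = tokens(sentence)
            for cluster in clusters:
                if jaccard(toks, tokens(cluster[0])) >= threshold:
                    cluster.append(sentence)
                    break
            else:
                # No existing cluster is similar enough: start a new one.
                clusters.append([sentence])
    # Keep the longest sentence of each cluster, on the assumption that it
    # carries the most detail.
    return [max(cluster, key=len) for cluster in clusters]


if __name__ == "__main__":
    duplicates = [
        "App crashes when saving a file. Happens on Windows 10.",
        "The app crashes when saving a large file. "
        "Stack trace shows a null pointer in SaveDialog.",
    ]
    for line in merge_reports(duplicates):
        print(line)
```

In this sketch the two crash sentences fall into one cluster, while the platform detail and the stack-trace detail each survive as their own cluster, so information that appears in only one report is not lost. A real system would likely use a stronger similarity measure and a proper clustering method.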
Similar resources
An Exploratory Study of Duplicate Bug Reports in OSS Projects
Open Source Software (OSS) projects use an open bug repository during development and maintenance, so that both developers and users can report the bugs they have found. These systems are generally called bug tracking systems or bug repositories. A bug tracking system is an open bug repository maintained by an open source software organization to track its bugs. In OSS bug reports from all over the w...
Assisted Detection of Duplicate Bug Reports
Duplicate bug reports, reports which describe problems or enhancements for which there is already a report in a bug repository, consume time of bug triagers and software developers that might better be spent working on reports that describe unique requests. For many open source projects, the number of duplicate reports represents a significant percentage of the repository, numbering in the thou...
Duplicate bug reports considered harmful ... really?
In a survey we found that most developers have experienced duplicate bug reports; however, only a few considered them a serious problem. This contradicts the popular wisdom that considers bug duplicates a serious problem for open source projects. In the survey, developers also pointed out that the additional information provided by duplicates helps to resolve bugs more quickly. In this paper, we th...
Performance of IR Models on Duplicate Bug Report Detection: A Comparative Study
Open source projects incorporate bug triagers to help with the task of assigning bug reports to developers. One of the tasks of a triager is to identify whether an incoming bug report is a duplicate of a pre-existing report. To detect duplicate bug reports, a triager relies either on his memory and experience or on the search capabilities of the bug repository. Both these approaches can...
A Bug Triage and Localization Technique based on Bug Reports Classification
With the great number of software products being developed, bug fixing is difficult because of the large number of bug reports submitted each day. Developers sometimes describe the same error in different bug reports; such reports are called duplicate bug reports. The increasing number of duplicates leads to a large amount of time and effort for identifying and analyzing bug r...